TOML: structured DottedKey AST and path-lookup utilities #7538

Draft
knutwannheden wants to merge 1 commit into main from toml-dotted-key-ast-path-lookup-utils

@knutwannheden knutwannheden commented May 1, 2026

Motivation

A dotted TOML key like physical.color was previously flattened into a single Toml.Identifier whose name was the joined string of all child tokens. That representation cannot distinguish site."google.com" (two segments, the second containing a literal dot) from site.google.com (three bare segments) — both became the string site.google.com.

The same logical key in TOML can be written in many equivalent ways:

[a.b.c]
x = 1

[a]
[a.b]
[a.b.c]
x = 1

a.b.c.x = 1

[a.b]
c.x = 1

Recipes wanting to find or modify a value at a logical key path each ended up doing ad-hoc traversal that handled only some of these forms.

Examples

A new Toml.DottedKey AST node carries an ordered list of segments:

Toml.KeyValue kv = ...;  // physical.color = "orange"
TomlKey key = kv.getKey();
key.getPath();  // ["physical", "color"]

Quoted segments containing literal dots stay one segment:

// site."google.com" = true
Toml.KeyValue kv = ...;
kv.getKey().getPath();  // ["site", "google.com"]   — length 2

// site.google.com = false
Toml.KeyValue other = ...;
other.getKey().getPath();  // ["site", "google", "com"]   — length 3
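
To make the segment distinction concrete, here is a minimal, self-contained sketch of quote-aware splitting. It is not the parser code from this PR: `DottedKeySplitSketch` and `splitPath` are hypothetical names, and the sketch ignores escape sequences inside basic strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of rewrite-toml.
final class DottedKeySplitSketch {

    /** Splits a raw dotted key into unquoted segments; dots inside quotes do not split. */
    static List<String> splitPath(String rawKey) {
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 = outside quotes, otherwise the opening quote character
        for (int i = 0; i < rawKey.length(); i++) {
            char c = rawKey.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;             // closing quote: drop it
                } else {
                    current.append(c);     // literal character, including '.'
                }
            } else if (c == '"' || c == '\'') {
                quote = c;                 // opening quote: drop it
            } else if (c == '.') {
                segments.add(current.toString());
                current.setLength(0);      // start the next segment
            } else if (!Character.isWhitespace(c)) {
                current.append(c);         // whitespace around segments is ignored
            }
        }
        segments.add(current.toString());
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(splitPath("site.\"google.com\"")); // [site, google.com]
        System.out.println(splitPath("site.google.com"));     // [site, google, com]
    }
}
```

Joining either result with `.` gives `site.google.com`, which is why a joined string alone cannot tell the two keys apart.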

A new TomlPaths static utility resolves a logical path to a KeyValue (or Table) regardless of authoring form:

Toml.Document doc = ...;  // any of the four equivalent forms above
Toml.KeyValue found = TomlPaths.findKeyValue(doc, List.of("a", "b", "c", "x"));
// non-null in every case; null when no such key exists
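
One way to picture why a single lookup can cover every authoring form: each key-value has an effective path, namely the path of its enclosing table header followed by its own key path. The sketch below models only that idea with plain lists. `EffectivePathSketch`, `Entry`, and `find` are hypothetical names, and the actual `TomlPaths` implementation may work differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; the real utility operates on the Toml LST, not on this record.
final class EffectivePathSketch {

    /** A key-value pair: enclosing header path (empty at top level) plus its own key path. */
    record Entry(List<String> tablePath, List<String> keyPath, String value) {
        List<String> effectivePath() {
            List<String> full = new ArrayList<>(tablePath);
            full.addAll(keyPath);
            return full;
        }
    }

    /** Returns the first entry whose effective path equals the target, or null if none matches. */
    static Entry find(List<Entry> document, List<String> target) {
        for (Entry entry : document) {
            if (entry.effectivePath().equals(target)) {
                return entry;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> target = List.of("a", "b", "c", "x");
        // [a.b.c] x = 1   (the nested-header form places x under the same header)
        List<Entry> header = List.of(new Entry(List.of("a", "b", "c"), List.of("x"), "1"));
        // a.b.c.x = 1
        List<Entry> flat = List.of(new Entry(List.of(), List.of("a", "b", "c", "x"), "1"));
        // [a.b] c.x = 1
        List<Entry> mixed = List.of(new Entry(List.of("a", "b"), List.of("c", "x"), "1"));

        System.out.println(find(header, target) != null);
        System.out.println(find(flat, target) != null);
        System.out.println(find(mixed, target) != null);
        System.out.println(find(flat, List.of("a", "b")) == null);
    }
}
```

Because the comparison is on segment lists rather than joined strings, a quoted segment such as `google.com` would never match the two segments `google` and `com`.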

Summary

  • Adds Toml.DottedKey implements TomlKey with List<TomlRightPadded<Toml.Identifier>> segments. Each segment preserves its own prefix/source for round-tripping; right-padding holds the whitespace before the next dot. Dots are emitted by the printer between segments and not stored.
  • Widens Toml.Table.name from TomlRightPadded<Toml.Identifier> to TomlRightPadded<TomlKey> so headers can carry either shape.
  • Adds TomlKey#getPath() returning the canonical unquoted segment list (length 1 for Identifier, N for DottedKey), and TomlKey#getName() returning the dot-joined form. The latter matches existing Identifier.getName() semantics so consumers comparing names as strings keep working unchanged.
  • New org.openrewrite.toml.TomlPaths static utility with findKeyValue and findTable.
  • TomlVisitor.visitTable now visits the table name (previously the name was silently skipped, breaking visitor-based transformations targeting headers).
  • SemanticallyEqual.keyEquals and TomlPathMatcher.buildPath simplified to use getPath().
  • PythonDependencyParser.indexTables adjusted so dotted-header tables (e.g. [tool.uv]) keep being indexed under their joined name.
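
The relationship between `getPath()` and `getName()` described in the list above can be sketched with a small stand-alone model. This is an assumption-laden illustration, not the code in this PR: `Key`, `Ident`, and `Dotted` are hypothetical stand-ins for `TomlKey`, `Toml.Identifier`, and `Toml.DottedKey`, and padding and markers are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for TomlKey, Toml.Identifier and Toml.DottedKey.
final class KeyPathSketch {

    interface Key {
        /** Canonical unquoted segments of this key. */
        List<String> getPath();

        /** Dot-joined form, kept for consumers that compare names as strings. */
        default String getName() {
            return String.join(".", getPath());
        }
    }

    /** A simple key: always exactly one segment. */
    record Ident(String name) implements Key {
        @Override
        public List<String> getPath() {
            return List.of(name);
        }
    }

    /** A dotted key: one path element per segment. */
    record Dotted(List<Ident> segments) implements Key {
        @Override
        public List<String> getPath() {
            List<String> path = new ArrayList<>();
            for (Ident segment : segments) {
                path.add(segment.name());
            }
            return path;
        }
    }

    public static void main(String[] args) {
        Key quoted = new Dotted(List.of(new Ident("site"), new Ident("google.com")));
        Key bare = new Dotted(List.of(new Ident("site"), new Ident("google"), new Ident("com")));
        System.out.println(quoted.getName().equals(bare.getName())); // true
        System.out.println(quoted.getPath().equals(bare.getPath())); // false
    }
}
```

In this model, string comparison on `getName()` behaves as before, while `getPath()` is the form to use when segment boundaries matter.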

Builds on the simple-key strip-quotes work landed in #7521.

Test plan

  • TomlParserTest.dottedKeys extended with structural assertions distinguishing site."google.com" (2 segments) from site.google.com (3 segments).
  • TomlParserTest.extraWhitespaceTable extended with a structural assertion that [a.b.c] and [ j . "ʞ" . 'l' ] each produce a 3-segment DottedKey.
  • New TomlPathsTest covering each equivalent authoring form resolving to the same path, quoted-segment-with-dot semantics, a missing path returning null, and array tables not being searched.
  • All existing :rewrite-toml:test tests pass (round-trip preservation for every fixture).
  • :rewrite-python:test passes for all TOML-touching tests.

A dotted TOML key like `physical.color` was previously flattened into a
single `Toml.Identifier` whose `name` was the joined string of all child
tokens. That representation cannot distinguish `site."google.com"` (two
segments, the second containing a literal dot) from `site.google.com`
(three bare segments) — both became the string `site.google.com`.
Recipes wanting to find or modify a value by logical key path each
ended up doing ad-hoc traversal that handled only some of the
equivalent authoring forms.

Add `Toml.DottedKey implements TomlKey` with an ordered list of
`Toml.Identifier` segments wrapped in `TomlRightPadded`. Each segment
preserves its own prefix/source for round-tripping, and the
right-padding holds the whitespace before the following dot. The dots
themselves are emitted by the printer between segments rather than
stored. `Toml.Table.name` widens from `TomlRightPadded<Toml.Identifier>`
to `TomlRightPadded<TomlKey>` so headers can carry either shape.
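
A rough sketch of how segments with their own whitespace could print back to the original text when the dots are not stored. `DottedKeyPrintSketch`, `Segment`, and its fields are hypothetical; the real printer works on `TomlRightPadded<Toml.Identifier>` and may differ in detail.

```java
import java.util.List;

// Hypothetical model of round-trip printing; not the rewrite-toml printer.
final class DottedKeyPrintSketch {

    /**
     * prefix: whitespace before the segment,
     * source: the segment as written (quotes included),
     * after:  whitespace between the segment and whatever follows it.
     */
    record Segment(String prefix, String source, String after) {}

    /** Emits each segment with its whitespace and a '.' between consecutive segments. */
    static String print(List<Segment> segments) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.size(); i++) {
            if (i > 0) {
                out.append('.'); // the dot is synthesized, never stored
            }
            Segment s = segments.get(i);
            out.append(s.prefix()).append(s.source()).append(s.after());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Key text of a header written as: [ j . "k" . 'l' ]
        List<Segment> spaced = List.of(
                new Segment(" ", "j", " "),
                new Segment(" ", "\"k\"", " "),
                new Segment(" ", "'l'", " "));
        System.out.println("[" + print(spaced) + "]");
    }
}
```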

`TomlKey` gains a `getPath()` default returning the canonical list of
unquoted segment names — singleton for a simple `Identifier`, N-element
for a `DottedKey`. A `getName()` default returns those segments joined
with `.`, matching the existing `Identifier.getName()` semantics so
consumers that compare names as strings keep working unchanged.

`TomlPaths` is a new static utility offering `findKeyValue` and
`findTable` over a `Document`. The finder walks the document and
matches a target path regardless of whether the document expressed it
as a flat dotted key (`a.b.c.x = 1`), nested headers
(`[a] [a.b] [a.b.c] x = 1`), `[a.b.c] x = 1`, `[a.b] c.x = 1`, or
nested inline tables. Quoted segments containing literal dots are
treated as one segment.

Also: `TomlVisitor.visitTable` now visits the table name so subclasses
that transform identifiers/dotted keys see headers as well as
key-value keys; previously the name was silently skipped.
`SemanticallyEqual.keyEquals` and `TomlPathMatcher` are simplified to
use `getPath()` directly. `PythonDependencyParser.indexTables` is
adjusted so dotted-header tables (e.g. `[tool.uv]`) keep being indexed.
@knutwannheden knutwannheden force-pushed the toml-dotted-key-ast-path-lookup-utils branch from 8e2ef24 to 83532b4 Compare May 1, 2026 09:26
@knutwannheden knutwannheden marked this pull request as draft May 1, 2026 09:28